Bayesian Algorithms for Causal Data Mining

Authors

  • Subramani Mani
  • Constantin F. Aliferis
  • Alexander R. Statnikov
Abstract

We present two Bayesian algorithms, CD-B and CD-H, for discovering unconfounded cause-and-effect relationships from observational data without assuming causal sufficiency, the assumption that precludes hidden common causes of the observed variables. The CD-B algorithm first estimates the Markov blanket of a node X using a Bayesian greedy search method and then applies Bayesian scoring methods to discriminate the parents and children of X. Using the set of parents and the set of children, CD-B constructs a global Bayesian network and outputs the causal effects of a node X based on the identification of Y arcs. Recall that if a node X has two parent nodes A and B and a child node C, such that there is no arc between A and B and neither A nor B is a parent of C, then the arc from X to C is called a Y arc. The CD-H algorithm instead uses the MMPC algorithm to estimate the union of the parents and children of a target node X; its subsequent steps are similar to those of CD-B. We evaluated the CD-B and CD-H algorithms empirically on simulated data from four different Bayesian networks. We also present comparative results based on the identification of Y structures and Y arcs in the output of the PC, MMHC and FCI algorithms. The results appear promising for mining causal relationships that are unconfounded by hidden variables from observational data.
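The Y-arc definition recalled in the abstract can be illustrated with a small sketch. The code below is not the CD-B or CD-H algorithm; it only checks the stated condition on an already known directed graph. The function name `find_y_arcs` and the edge-list representation are assumptions made for illustration.

```python
from itertools import combinations


def find_y_arcs(edges):
    """Return every arc (X, C) that meets the Y-arc condition from the abstract.

    `edges` is an iterable of (parent, child) pairs describing a directed graph.
    This is a hypothetical helper, not part of CD-B or CD-H.
    """
    edges = set(edges)

    # Index parents and children of every node.
    parents, children = {}, {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
        children.setdefault(src, set()).add(dst)

    y_arcs = set()
    for x, pa in parents.items():
        # X needs two parents A and B with no arc between them.
        for a, b in combinations(pa, 2):
            if (a, b) in edges or (b, a) in edges:
                continue
            # X needs a child C that has neither A nor B as a parent.
            for c in children.get(x, ()):
                if (a, c) in edges or (b, c) in edges:
                    continue
                y_arcs.add((x, c))
    return y_arcs


if __name__ == "__main__":
    graph = [("A", "X"), ("B", "X"), ("X", "C")]
    print(find_y_arcs(graph))
```

In this toy graph the arc from X to C qualifies. Adding an arc between A and B, or from A to C, would disqualify it under the definition above.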

Similar resources

A Comparison of Association Rule Discovery and Bayesian Network Causal Inference Algorithms to Discover Relationships in Discrete Data

Association rules discovered through attribute-oriented induction are commonly used in data mining tools to express relationships between variables. However, causal inference algorithms discover more concise relationships between variables, namely, relations of direct cause. These algorithms produce regressive structured equation models for continuous linear data and Bayes networks for discrete...

A Simulation Study of Three Related Causal Data Mining Algorithms

In all scientific domains causality plays a significant role. This study focused on evaluating and refining efficient algorithms to learn causal relationships from observational data. Evaluation of learned causal output is difficult, due to lack of a gold standard in real-world domains. Therefore, we used simulated data from a known causal network in a medical domain, the Alarm network. For causal disc...

An Introduction to Inference and Learning in Bayesian Networks

Bayesian networks (BNs) are modern tools for modeling phenomena in dynamic and static systems and are used in different subjects such as disease diagnosis, weather forecasting, decision making and clustering. A BN is a graphical-probabilistic model which represents causal relations among random variables and consists of a directed acyclic graph and a set of conditional probabilities. Structure...

Comparison of Four Data Mining Algorithms for Predicting Colorectal Cancer Risk

Background and Objective: Colorectal cancer (CRC) is one of the most prevalent malignancies in the world. The early detection of CRC is not only a simple process, but it is also the key to its treatment. Given that data mining algorithms could be potentially useful in cancer prognosis, diagnosis, and treatment, the main focus of this study is to measure the performance of some data mining class...

Scalable Techniques for Mining Causal

Mining for association rules in market basket data has proved a fruitful area of research. Measures such as conditional probability (confidence) and correlation have been used to infer rules of the form "the existence of item A implies the existence of item B." However, such rules indicate only a statistical relationship between A and B. They do not specify the nature of the relationship: whethe...

Publication date: 2010